feat: configurable vector datatype (int8 quantization) for long-term memory #302
silversurfer562 wants to merge 2 commits into
Adds a REDISVL_DATATYPE setting so the long-term-memory vector index can use int8 (and other RedisVL datatypes) instead of the hardcoded float32. int8 cuts index memory ~75% and speeds search ~30% with negligible recall loss (Redis 8 Query Engine required for TYPE INT8).
- config: new redisvl_datatype setting (default "float32")
- factory: _build_redis_schema uses settings.redisvl_datatype, and passes it to RedisVLMemoryVectorDatabase
- vector db: encode/query honor the datatype. Float types go through RedisVL's array_to_buffer; int8 is quantized first (per-vector max-abs scaling; RedisVL validates the int8 range but does not quantize). Query vectors are quantized to match and the VectorQuery/RangeQuery dtype is set accordingly.
Default behavior is unchanged (float32). Adds 6 tests; full test_memory_vector_db.py suite passes (33).
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds support for configuring the RedisVL vector datatype (default float32, optional int8) and introduces int8 quantization/encoding so the stored vectors and query vectors match the configured datatype.
Changes:
- Add `redisvl_datatype` setting and thread it into Redis schema construction and DB instantiation.
- Implement int8 per-vector max-abs quantization and unified vector byte encoding via `array_to_buffer`.
- Add tests covering default datatype behavior, quantization, encoding, and schema wiring.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/test_memory_vector_db.py | Adds tests for datatype defaulting, quantization/encoding behavior, and schema datatype propagation. |
| agent_memory_server/memory_vector_db_factory.py | Wires settings.redisvl_datatype into schema creation and DB construction. |
| agent_memory_server/memory_vector_db.py | Adds datatype parameter, quantization, encoding, and passes dtype through to RedisVL queries. |
| agent_memory_server/config.py | Introduces redisvl_datatype setting with default float32. |
| redisvl_vector_dimensions: str = "1536" | ||
| redisvl_index_prefix: str = "memory_idx" | ||
| redisvl_indexing_algorithm: str = "HNSW" | ||
| redisvl_datatype: str = "float32" |
| """Quantize a float embedding to int8 range for an int8 index. | ||
|
|
||
| RedisVL validates the int8 range but does not quantize; float | ||
| datatypes pass through unchanged. Per-vector max-abs scaling is | ||
| used, which COSINE distance is invariant to. | ||
| """ |
| arr = np.asarray(embedding, dtype=np.float32) | ||
| peak = float(np.max(np.abs(arr))) or 1.0 | ||
| scaled = np.clip(np.round(arr * (127.0 / peak)), -127, 127) | ||
| return scaled.astype(np.int8).tolist() |
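The docstring's claim that COSINE distance is invariant to per-vector max-abs scaling can be checked with a small standalone sketch. This is a pure-Python illustration, not the PR's NumPy implementation; `quantize_int8` and `cosine` are hypothetical helper names, and the sample vectors are made up.

```python
import math


def quantize_int8(embedding):
    """Scale a vector so its largest magnitude maps to 127, then round and clip."""
    # An all-zero vector has peak 0; fall back to 1.0 to avoid dividing by zero.
    peak = max((abs(x) for x in embedding), default=0.0) or 1.0
    scale = 127.0 / peak
    return [max(-127, min(127, round(x * scale))) for x in embedding]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


doc = [0.12, -0.40, 0.33, 0.05]
query = [0.10, -0.35, 0.30, 0.00]

exact = cosine(doc, query)
approx = cosine(quantize_int8(doc), quantize_int8(query))
# Scaling by a positive constant leaves the angle unchanged; only rounding adds error.
print(f"float cosine={exact:.5f} int8 cosine={approx:.5f}")
```

Because each vector gets its own scale factor, vector magnitudes are not comparable after quantization, which is why this approach only suits metrics that ignore magnitude.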
| original = settings.redisvl_datatype | ||
| try: | ||
| settings.redisvl_datatype = "int8" | ||
| schema = _build_redis_schema() | ||
| vec = next(f for f in schema["fields"] if f.get("type") == "vector") | ||
| assert vec["attrs"]["datatype"] == "int8" | ||
| finally: | ||
| settings.redisvl_datatype = original |
…ypatch
Addresses Copilot review feedback:
- config: field_validator normalizes redisvl_datatype to lowercase and validates against RedisVL's VectorDataType set (rejects e.g. 'float').
- factory: raise a clear ValueError when a quantized datatype (int8/uint8) is paired with a non-cosine distance metric, since per-vector max-abs scaling changes geometry for L2/IP.
- vector db: _maybe_quantize returns an np.int8 array (no .tolist() boxing); array_to_buffer consumes it directly.
- tests: use monkeypatch instead of mutating global settings; add tests for the validator (normalize + reject) and the int8/cosine guard.
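The two validation rules in this commit message can be sketched without Pydantic or RedisVL. This is a minimal illustration under assumptions: the function names are hypothetical, and `ALLOWED_DATATYPES` is assumed to mirror the values of RedisVL's `VectorDataType` rather than being imported from the library.

```python
# Assumption: these mirror the datatypes RedisVL's VectorDataType accepts.
ALLOWED_DATATYPES = {"bfloat16", "float16", "float32", "float64", "int8", "uint8"}
QUANTIZED_DATATYPES = {"int8", "uint8"}


def normalize_datatype(value: str) -> str:
    """Lowercase the configured datatype and reject unknown values."""
    normalized = value.lower()
    if normalized not in ALLOWED_DATATYPES:
        raise ValueError(
            f"Unsupported vector datatype {value!r}; "
            f"expected one of {sorted(ALLOWED_DATATYPES)}"
        )
    return normalized


def check_metric_compatibility(datatype: str, distance_metric: str) -> None:
    """Reject quantized datatypes paired with a non-cosine distance metric."""
    if datatype in QUANTIZED_DATATYPES and distance_metric.lower() != "cosine":
        raise ValueError(
            f"Datatype {datatype!r} uses per-vector scaling and requires COSINE, "
            f"got {distance_metric!r}"
        )


datatype = normalize_datatype("INT8")  # normalized to lowercase
check_metric_compatibility(datatype, "COSINE")  # passes silently
```

Failing at configuration or schema-build time surfaces a bad setting before any vectors are written.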
Thanks for the review — all four points addressed in d0955b8:
Full |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit d0955b8.
| def _encode_vector(self, embedding: Any) -> bytes: | ||
| """Encode an embedding to bytes for the configured datatype.""" | ||
| return array_to_buffer(self._maybe_quantize(embedding), dtype=self._datatype) |
uint8 config lacks quantization
Medium Severity
redisvl_datatype can be set to uint8 (validated like other RedisVL types), and _build_redis_schema treats uint8 as quantized, but _maybe_quantize only scales for int8. Indexing and search then pass raw float embeddings through array_to_buffer with dtype=uint8, so stored/query vectors won’t match a proper uint8 index.
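One reason a uint8 path cannot simply reuse the int8 scaling is that mapping signed values into an unsigned range needs an offset, and an offset is not cosine-invariant. The sketch below illustrates this with a hypothetical affine mapping (`to_uint8_shifted` is an illustrative name, not code from the PR or from RedisVL).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def to_uint8_shifted(embedding):
    """Hypothetical mapping of [-peak, peak] onto [0, 255] via scale plus offset."""
    peak = max((abs(x) for x in embedding), default=0.0) or 1.0
    return [max(0, min(255, round((x / peak) * 127.5 + 127.5))) for x in embedding]


a = [1.0, 0.0]
b = [-1.0, 0.0]

before = cosine(a, b)  # opposite directions
after = cosine(to_uint8_shifted(a), to_uint8_shifted(b))
# The offset pushes both vectors into the positive quadrant, so the sign flips.
print(f"before={before:.3f} after={after:.3f}")
```

Any uint8 support would therefore need its own quantization scheme and its own check against the configured distance metric, rather than falling through to the float path.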
What
Adds a `REDISVL_DATATYPE` setting so the long-term-memory vector index can use int8 (and the other RedisVL datatypes) instead of the currently hardcoded `float32`.

Why
On a Redis 8 Query Engine, int8-quantized vectors cut index memory by ~75% and speed search ~30% with negligible recall loss. Today the datatype is hardcoded to `float32` in `_build_redis_schema` and the write path, so there is no way to opt in. (`dims`, `distance_metric`, and `algorithm` are already settings-driven; this brings `datatype` in line.)

How
- `redisvl_datatype` setting, default `"float32"`.
- `_build_redis_schema` reads `settings.redisvl_datatype`; `create_redis_memory_vector_db` passes it to the DB.
- `datatype` constructor arg (default `float32`). Encoding/queries honor it: float types go through RedisVL's `array_to_buffer`; int8 is quantized first via per-vector max-abs scaling (RedisVL validates the int8 range but does not quantize), which COSINE is invariant to. Query vectors are quantized to match and the `VectorQuery`/`RangeQuery` `dtype` is set accordingly.

Compatibility
Default is unchanged (`float32`); existing deployments are unaffected. int8 requires the Redis 8 Query Engine (`TYPE INT8` vector fields); older servers reject it at index creation.

Testing
- 6 new tests (quantization, encoding, default, schema datatype). Full `tests/test_memory_vector_db.py` passes (33).
- `redis:8` + Ollama (`nomic-embed-text`, 768-dim): int8 index created, correct semantic ranking; float32 default path unchanged.
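The ~75% memory figure quoted under Why is consistent with the per-element byte width alone: float32 uses 4 bytes per dimension and int8 uses 1. The sketch below works through that arithmetic for the raw vector payload only. It is a back-of-the-envelope estimate with hypothetical helper names, and it ignores HNSW graph links and other index overhead, so real savings may differ.

```python
# Bytes used to store one element of each datatype.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "int8": 1}


def raw_vector_bytes(dims: int, datatype: str, num_vectors: int = 1) -> int:
    """Bytes needed for the vector payload alone (no index overhead)."""
    return dims * BYTES_PER_ELEMENT[datatype] * num_vectors


def savings_vs_float32(datatype: str) -> float:
    """Fraction of raw vector bytes saved relative to float32."""
    return 1 - BYTES_PER_ELEMENT[datatype] / BYTES_PER_ELEMENT["float32"]


for dims in (768, 1536):
    f32 = raw_vector_bytes(dims, "float32")
    i8 = raw_vector_bytes(dims, "int8")
    print(f"{dims}-dim: float32={f32} B, int8={i8} B, saved={savings_vs_float32('int8'):.0%}")
```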
Note
Medium Risk
Changes core long-term memory indexing and search encoding; wrong datatype or re-indexing without migration could break existing vector indexes, though the default path is unchanged.
Overview
Adds `redisvl_datatype` (default `float32`) so long-term memory vector indexes are no longer hardcoded to float32. The Redis schema, factory, and `RedisVLMemoryVectorDatabase` now honor the setting end-to-end: index creation, writes (`array_to_buffer` + optional int8 per-vector max-abs quantization), and semantic/hybrid/recency queries (quantize query vectors and pass `dtype` on `VectorQuery`/`RangeQuery`/hybrid).

Validation: Pydantic checks values against RedisVL `VectorDataType` (case-normalized). `int8`/`uint8` require cosine distance at schema build time because quantization breaks other metrics.

Default remains `float32`; opting into `int8` needs Redis 8 Query Engine support. New unit tests cover quantization, encoding, config, and schema rules.

Reviewed by Cursor Bugbot for commit d0955b8. Bugbot is set up for automated code reviews on this repo.